Bayesian nonlinear regression for large p small n problems

Authors

  • Sounak Chakraborty
  • Malay Ghosh
  • Bani K. Mallick
Abstract

Statistical modelling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p small n problem. We develop nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik’s ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS). This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We also consider the relevance vector machine (RVM) introduced by Bishop, Tipping and others. Instead of the original treatment of the RVM, which relies on type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use Markov chain Monte Carlo techniques for computation. We apply our model to the prediction of blood glucose concentration in diabetics using fluorescence-based optics. We extend the full Bayesian support vector regression (SVR) and relevance vector regression (RVR) models to the case where the response is multivariate. We also propose an empirical Bayes RVM and SVM. The multivariate versions of the SVM and RVM are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models.
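
As a rough illustration of two ingredients named in the abstract, the sketch below evaluates Vapnik's insensitive loss and a kernel expansion of the kind used in RKHS regression. It is not the authors' model: the Gaussian kernel, the scale parameter `theta`, the function names, and the toy numbers are all assumptions made for the example, and no prior or MCMC step is shown.

```python
import math

def eps_insensitive_loss(residual, eps):
    """Vapnik's insensitive loss: zero inside the tube |r| <= eps, linear outside."""
    return max(0.0, abs(residual) - eps)

def gaussian_kernel(x, z, theta=1.0):
    """Gaussian kernel; the kernel choice and theta are assumptions for illustration."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / theta)

def rkhs_predict(x_new, x_train, beta0, beta, theta=1.0):
    """Kernel expansion f(x) = beta0 + sum_j beta_j K(x, x_j) over the n training points."""
    return beta0 + sum(
        b * gaussian_kernel(x_new, xj, theta) for b, xj in zip(beta, x_train)
    )

# Toy "large p small n" data: n = 2 observations, p = 5 covariates (hypothetical values).
x_train = [[0.0, 1.0, 0.0, 2.0, 1.0], [1.0, 0.0, 1.0, 0.0, 2.0]]
beta = [0.5, -0.25]  # hypothetical coefficients, one per training point

pred = rkhs_predict(x_train[0], x_train, beta0=1.0, beta=beta)
loss = eps_insensitive_loss(1.6 - pred, eps=0.1)
print(pred, loss)
```

The number of coefficients in the expansion equals the sample size n rather than the number of covariates p, which is one reason kernel methods suit the large p small n setting.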

Similar resources

A Bayesian Nominal Regression Model with Random Effects for Analysing Tehran Labor Force Survey Data

Large survey data are often accompanied by sampling weights that reflect the unequal probabilities of selecting samples in complex sampling. Sampling weights act as an expansion factor that, by scaling the subjects, makes the sample representative of the population. The quasi-maximum likelihood method is one of the approaches for considering sampling weights in the frequentist framewo...

The Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models

In previous studies on fitting non-linear regression models with the symmetric structure the normality is usually assumed in the analysis of data. This choice may be inappropriate when the distribution of residual terms is asymmetric. Recently, the family of scale-mixture of skew-normal distributions is the main concern of many researchers. This family includes several skewed and heavy-tailed d...

Bayesian Inference for Spatial Beta Generalized Linear Mixed Models

In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on (0, 1) interval that covers symm...

Towards Bayesian experimental design for nonlinear models that require a large number of sampling times

The use of Bayesian methodologies for solving optimal experimental design problems has increased. Many of these methods have been found to be computationally intensive for design problems that require a large number of design points. A simulation-based approach that can be used to solve optimal design problems in which one is interested in finding a large number of (near) optimal design points ...

Monitoring Nonlinear Profiles in Phase II Using Wavelets

In many industrial processes, the quality of a process can be characterized as a nonlinear relation between a response variable and explanatory variables. Several articles suggest the use of nonlinear regression for monitoring nonlinear profiles. Such regression has two disadvantages. First, the distribution of the regression coefficients cannot be specified for small samples, and second, with in...

Journal:
  • J. Multivariate Analysis

Volume 108

Publication date: 2012